[00:15.040 --> 00:20.280]  We have another great speaker for you today, Dr. Pinky, presenting
[00:21.000 --> 00:24.340]  Discovering ELK the First Time: Lessons Learned Over Two Years.
[00:24.660 --> 00:28.740]  And with that, I will let Dr. Pinky take it away.
[00:29.440 --> 00:33.800]  Hey guys, thank you so much for joining me today. I know it's a little late in the evening.
[00:34.160 --> 00:37.560]  I know things start a little bit late with Pacific, but we're kind of getting into the afternoon crash,
[00:37.560 --> 00:42.060]  evening crash, depending on your time zone. So hang in with me and let's have some fun, OK?
[00:42.840 --> 00:46.220]  So about me real fast, other than the fact that I have pink hair, obviously.
[00:46.740 --> 00:51.980]  I'm a Linux and Windows threat hunter. I prefer Linux, but for some reason, Windows seems to have the marketplace.
[00:51.980 --> 00:56.480]  Don't know why. Can't be its usability or anything. So I end up there most of the time.
[00:56.780 --> 01:00.600]  I'm a huge fan of scotch, whiskey, and bourbon. You know, the main things you actually care about.
[01:01.200 --> 01:05.780]  I'm over here in San Antonio, Texas, and as a result, I help coordinate various groups here,
[01:05.780 --> 01:11.420]  including BSides San Antonio and SAHA. And of course, the mandatory disclaimer,
[01:11.420 --> 01:16.580]  all views, thoughts, opinions, anything else I say here are my own and are not representative of my employer.
[01:17.080 --> 01:21.340]  With that out of the way, let's go into what we're talking about today.
[01:21.500 --> 01:26.080]  First off, I'll just lay out some quick expectations, just so we're all on the same page.
[01:26.420 --> 01:31.260]  And I know most of you, since we're blue teamers for the most part here, are probably familiar with the Elastic Stack,
[01:31.260 --> 01:35.820]  but I'm going to get into it just so we're using all the same terminology with the same understanding.
[01:36.640 --> 01:41.300]  And then we'll go into four fun lessons I learned the hard way these last two years.
[01:41.300 --> 01:46.020]  And I'll admit, some of them are pretty embarrassing. So hopefully you get a few good laughs out of this.
[01:46.500 --> 01:53.720]  So with that, let's get into it. So after two years, you may think, hey, I should be an expert by now.
[01:53.720 --> 01:57.580]  But no way. This keeps changing every day, it feels like.
[01:57.580 --> 02:03.600]  And every time I think I get a grasp of something, it changes. So that's the fun thing about our community, right?
[02:04.300 --> 02:09.880]  This talk is geared toward analysts, threat hunters, things of that nature, not toward building or deploying stacks.
[02:09.880 --> 02:16.680]  So anything I say here could, you know, crash your production, or your, you know, stack in production.
[02:16.980 --> 02:20.960]  Sorry, live stack. Don't do it. Play first, then deploy.
[02:21.640 --> 02:24.440]  This is aimed at giving you a starting point to research and play.
[02:24.440 --> 02:28.640]  Again, please don't go, hey, let's go live with this right away.
[02:29.300 --> 02:33.800]  And again, this assumes basic familiarity with the Elastic Stack.
[02:33.840 --> 02:37.540]  So with that out of the way, what is the Elastic Stack?
[02:37.600 --> 02:43.700]  The most basic definition you could give is that the Elastic Stack consists of multiple open source projects,
[02:43.700 --> 02:51.800]  which, when used together, take data from any source and enable search, analysis, and the creation of visualizations and dashboards.
[02:51.800 --> 02:56.260]  It was originally called the ELK Stack, and I've noticed that name is still constantly used today.
[02:56.640 --> 03:03.320]  And it was called that because it was originally just Elasticsearch, Logstash, and Kibana for those open source projects.
[03:03.720 --> 03:08.020]  However, back in version 5, they introduced Beats.
[03:08.020 --> 03:11.140]  And Beats is a fantastic thing we'll get into briefly in this talk.
[03:11.140 --> 03:15.120]  But in summary, it's just a way to gather your data and ship it for you.
[03:15.120 --> 03:16.620]  So you don't have to worry about that.
[03:16.820 --> 03:20.840]  And for some reason, I guess they didn't like the sound of ELK anymore or anything like that.
[03:20.840 --> 03:22.520]  So it got called the Elastic Stack.
[03:24.820 --> 03:28.980]  So what is the Elastic Stack and its most basic components today?
[03:29.340 --> 03:32.020]  Note that I'm just going over the four primary ones.
[03:32.020 --> 03:34.620]  You can add many more things to this.
[03:34.760 --> 03:38.660]  But again, this is just so we can understand the basic terms.
[03:39.160 --> 03:44.100]  You have Kibana, which is what analysts, threat hunters, whatever, are going to be the most used to.
[03:44.100 --> 03:49.160]  It's the web interface. It's your visualization tool that lets you, again, visualize, search, and navigate.
[03:49.160 --> 03:51.220]  Just a handy little web GUI.
[03:51.420 --> 03:54.060]  And all this data is actually stored in Elastic.
[03:54.320 --> 03:56.560]  Elastic is the heart of the stack. Stores the logs.
[03:56.560 --> 03:59.800]  Actually does the really sexy indexing in the background.
[03:59.840 --> 04:03.100]  Does all the things that we take for granted, to be honest.
[04:04.480 --> 04:08.560]  Logstash is the other undervalued aspect, which is the pipeline.
[04:08.680 --> 04:12.620]  So it takes data, parses it, and spits it back out.
[04:12.620 --> 04:17.200]  And more than that, it can actually enrich it, like we'll get into with lesson number two.
[04:18.060 --> 04:25.560]  Beats is located on your actual endpoints, collects the configured logs or whatever files you have, and sends them to Logstash.
[04:25.880 --> 04:30.500]  So with that out of the way, very first lesson I learned.
[04:31.540 --> 04:34.020]  Isn't Elastic and Kibana the same?
[04:34.020 --> 04:37.780]  For the longest time, I didn't understand how these were different.
[04:38.220 --> 04:43.240]  And hopefully after this, you'll understand that they aren't, if you're like younger me.
[04:43.240 --> 04:48.100]  But why would I ever think this? Because, I mean, come on. Their names are two separate things.
[04:48.120 --> 04:50.840]  I just went over it. It should be obvious.
[04:51.740 --> 04:58.040]  However, in a lot of implementations, not all, Kibana and Elastic can be on the same page.
[04:58.920 --> 05:04.840]  The same webpage, that is. So in the screenshot I grabbed here from GitHub, you can actually see, you have Discover, which is Kibana.
[05:04.880 --> 05:09.680]  Kibana is featured at the top. But down at the bottom, you have Management and Monitoring.
[05:09.680 --> 05:13.480]  And some of that can start getting into Elastic and Logstash.
[05:14.200 --> 05:19.060]  Additionally, if you notice the error up there, it says, "Error: Request to Logstash failed."
[05:19.060 --> 05:20.820]  I'm sorry, Elasticsearch.
[05:22.160 --> 05:30.420]  And that kind of introduces this confusion to beginners of, wait, if I'm using Kibana, why am I getting errors for Elastic?
[05:30.740 --> 05:33.880]  Again, as you get more familiar with this, it makes a bit more sense.
[05:34.020 --> 05:36.600]  But for a beginner, this can kind of confuse you.
[05:37.680 --> 05:45.600]  Additionally, without fully understanding the full stack, you may think that Elastic is just part of Kibana, but they're different things.
[05:46.520 --> 05:49.820]  And to make it worse, when you go to Management, they both have the word index.
[05:49.820 --> 05:53.180]  Again, this should be, you know, kind of easy stuff you get used to.
[05:53.280 --> 05:58.340]  But for someone brand new to the stack, you're just going to go, Kibana, Elastic, cool, same thing.
[05:59.040 --> 06:05.040]  So what is actually happening here is that a normal analyst is probably never really going to interact with Elastic itself.
[06:05.040 --> 06:11.480]  That's the goal. You should just be able to do everything you need through Discover, Visualize, and Dashboard, hopefully.
[06:11.540 --> 06:16.480]  Maybe a little bit of machine learning, maybe a little bit of Timelion, maybe a bit of Graph.
[06:16.540 --> 06:19.680]  But overall, you should be sticking with the Kibana features themselves.
[06:20.140 --> 06:28.880]  As a result, you're probably primarily interacting with Kibana, but occasionally get messages like these with Elastic, and there's more you can run into.
[06:29.700 --> 06:34.520]  Additionally, when Elastic goes down, Kibana doesn't really work.
[06:34.520 --> 06:39.680]  The data disappears and you just get ugly errors, and depending how bad it is, it's spitting errors all the time.
[06:39.680 --> 06:46.220]  And you're like, well, if they're not, you know, together, then why am I seeing those errors?
[06:46.420 --> 06:49.160]  So that causes a lot of confusion to beginners.
[06:50.440 --> 06:56.040]  So as a result, let's try to go into this a little bit more and understand better some of these differences.
[06:56.600 --> 07:01.360]  And for this, I'm just going to go into the three primary features, at least I think are important.
[07:01.360 --> 07:06.560]  The purpose of them, how you access them, and their search capabilities, because guess what?
[07:06.560 --> 07:09.360]  You can search in both of them, which introduces more confusion.
[07:10.060 --> 07:17.080]  So remember, Elastic's purpose in life is to obtain the logs, index them, and hold them in an internal database.
[07:17.580 --> 07:22.620]  Because it's already the database, you can actually have search functionality through it.
[07:22.620 --> 07:31.740]  It's just typically through cURL, or you can interact with it a few other ways, like Python, for example, which we'll get into later.
[07:31.900 --> 07:38.820]  Versus Kibana, its purpose is actually just to provide you, the user, the ability to create data visualizations.
[07:39.100 --> 07:41.980]  And again, we can make that a bit more sexy with the dashboards.
[07:42.000 --> 07:44.760]  And it does provide some search capabilities.
[07:44.760 --> 07:49.500]  But if all we cared about was search, we probably wouldn't have ever introduced Kibana.
[07:50.500 --> 07:55.860]  To access Elastic, again, you can use cURL generally through GET requests.
[07:55.880 --> 08:01.100]  However, we typically interact with it with Kibana, which does all this in the background for us.
[08:01.200 --> 08:07.060]  And again, with Kibana, you just open your favorite web browser, navigate to the IP or URL, and life's good.
[08:07.980 --> 08:14.300]  And for the search capability for Elastic, again, you're going to do it through those GET requests or through a RESTful API.
[08:14.300 --> 08:17.480]  And it uses the Elasticsearch Query DSL.
[08:17.480 --> 08:23.500]  And if you're used to interacting with that, you're going to have issues when you go to Kibana because it's going to keep scolding you.
[08:23.500 --> 08:29.140]  Hey, stop using that. Use my search capability, which is the Kibana Query Language.
[08:29.220 --> 08:32.500]  You can make it work. It allows you to use the Query DSL directly.
[08:32.500 --> 08:37.580]  But in general, if you're new at this, just try to get familiar with the Kibana Query Language.
[08:37.660 --> 08:41.140]  And again, you just access that through your handy web interface.
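To make the two search styles concrete, here is a minimal Python sketch that builds the same simple search both ways: as a Query DSL request body (the JSON you would send to Elasticsearch) and as the one-line text you would type into the Kibana search bar. The field name `host.name` and the value `web01` are made up for illustration, and the two forms are only roughly comparable, not guaranteed to match identically.

```python
import json

def match_query(field, value, size=10):
    """Build an Elasticsearch Query DSL body for a simple match search."""
    return {"size": size, "query": {"match": {field: value}}}

def search_bar_query(field, value):
    """Build the roughly comparable one-liner for the Kibana search bar."""
    return f'{field}:"{value}"'

# Hypothetical field and value, purely for illustration.
body = match_query("host.name", "web01")

# JSON body you would send to the index's _search endpoint (for example with cURL).
print(json.dumps(body, indent=2))

# Text you would type into Discover instead.
print(search_bar_query("host.name", "web01"))
```

The point of the comparison is that the DSL version is structured JSON meant for the API, while the Kibana version is a short string meant for a human at the search bar.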
[08:42.600 --> 08:47.880]  So, to kind of help summarize this, Elastic is your distributed search engine.
[08:47.880 --> 08:53.860]  It takes in the logs, and not even necessarily through Logstash. You can just feed the logs in directly and it will index them.
[08:54.200 --> 08:59.560]  Again, it can receive directly through HTTP request methods or through Logstash.
[08:59.900 --> 09:04.560]  If you have another replacement for Logstash, that's fine. It'll accept that too.
[09:05.180 --> 09:10.860]  And it's truly the heart, the engine, that helps make searches happen so quickly for Kibana.
[09:10.860 --> 09:16.020]  Kibana does none of the heavy-duty effort to make that happen.
[09:17.120 --> 09:22.640]  So, Kibana, on the other hand, is your browser-based analytics and search dashboard for Elastic.
[09:22.960 --> 09:26.680]  And again, it's what your analyst or you are going to typically interact with.
[09:26.860 --> 09:30.880]  Again, it allows you to conduct your searches, create visualizations, and dashboards.
[09:31.480 --> 09:34.900]  After all this, you may be asking, why do I care?
[09:35.360 --> 09:39.180]  First off, incorrect terminology can, one, make you sound a little silly,
[09:39.180 --> 09:42.540]  two, result in you potentially being laughed at behind your back.
[09:43.040 --> 09:48.240]  But mainly, the reason you should care is that understanding the differences
[09:48.240 --> 09:52.580]  and the interaction between the two can help you better improve your troubleshooting skills.
[09:52.580 --> 09:57.140]  That way, when you encounter those errors, you can better understand what needs to happen.
[09:57.280 --> 09:59.800]  Alternatively, you can just submit a better ticket.
[10:00.100 --> 10:04.520]  Additionally, as we'll get into soon, if you can't quite make the right search query,
[10:04.520 --> 10:07.980]  understanding overall the interaction with the stack is going to help you with that.
[10:08.480 --> 10:12.260]  So, even if you think, hey, I'm never going to deploy the stack, I don't care,
[10:12.260 --> 10:15.280]  you should care, at least at a very basic level, about the differences.
[10:16.900 --> 10:23.980]  So, with that out of the way, second thing I still say to this day,
[10:24.500 --> 10:28.460]  "Why can't I search for [enter the field name]? I see it's a field."
[10:28.460 --> 10:32.640]  In this case, hostname. I'll admit this was me probably three weeks ago.
[10:33.280 --> 10:37.500]  So, again, me on a new and or existing setup all the time.
[10:38.560 --> 10:44.240]  Take a moment and briefly see if you can figure out what's the problem if you use this.
[10:45.720 --> 10:49.660]  Okay, you got your five seconds. A little yellow triangle.
[10:49.660 --> 10:54.820]  I like calling it the yellow triangle of doom. I'm sure it has a better name, but it's what I call it.
[10:56.020 --> 10:59.480]  So, what is the symptom when you're suffering from this?
[10:59.480 --> 11:03.640]  You know a field exists, but you can't reference it. You go to Kibana, you go and search,
[11:03.640 --> 11:08.660]  and you're like, take this field name and search for this value. And it's like, no records.
[11:08.660 --> 11:13.900]  And you're 100% positive that one, the field exists, and two, something should have that.
[11:14.260 --> 11:19.560]  So, what you do is that you find a record within Kibana, you know, you can just go to Discover,
[11:19.560 --> 11:23.920]  roll through the records, and look at them until you find that field name you care about.
[11:23.920 --> 11:30.300]  You can then search for that specific field name and its value and still have trouble.
[11:30.520 --> 11:34.580]  At that point, you'll probably notice, hey, I have that little yellow triangle.
[11:35.220 --> 11:40.140]  Roll over it, and you'll most likely see the message, no cached mapping for this field.
[11:41.360 --> 11:46.100]  And the really quick way to just fix this is go into management, go to index patterns,
[11:46.940 --> 11:51.300]  find your index pattern, in this case, just Logstash at the bottom of the screen,
[11:51.300 --> 11:56.360]  and click the little refresh button. That'll fix it. You can search it, and life is happy.
[11:57.320 --> 12:00.420]  But you should be asking yourself, why does this happen?
[12:00.940 --> 12:06.500]  It happens whenever you get either one or more new fields introduced to an index pattern.
[12:07.440 --> 12:13.100]  And remember, your index pattern is going to be the interaction between Elastic and Kibana happening here.
[12:13.100 --> 12:18.300]  So, what happens is that Kibana has a cached version of the mapping given by Elastic,
[12:18.300 --> 12:23.300]  and it doesn't have this field. So, therefore, it doesn't really know how to give you those results.
[12:24.040 --> 12:29.760]  So, refreshing the index pattern allows for the creation of this mapping, fixing it.
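As a rough mental model of the cached-mapping problem, here is a small Python sketch. It is only an analogy, not how Kibana is implemented: the cached field list is a plain set, and "refresh" just means re-reading the field names from the documents. All field names and values are invented for illustration.

```python
def unmapped_fields(cached_fields, document):
    """Return the document's field names that the cached field list lacks."""
    return sorted(set(document) - set(cached_fields))

def refresh(cached_fields, documents):
    """Rebuild the cached field list from what the documents actually contain."""
    refreshed = set(cached_fields)
    for doc in documents:
        refreshed.update(doc)
    return refreshed

# Field list cached before a new field was introduced (hypothetical names).
cached = {"@timestamp", "message", "host"}

# A newer record that now carries an extra "user" field.
doc = {"@timestamp": "2020-03-01T14:15:00Z", "message": "login", "host": "ws01", "user": "alice"}

print(unmapped_fields(cached, doc))   # the fields that would show the warning
cached = refresh(cached, [doc])
print(unmapped_fields(cached, doc))   # nothing left after the refresh
```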
[12:30.860 --> 12:36.540]  So, why would this ever happen? You know, we're in this nice, stable organization. This will never happen.
[12:37.940 --> 12:42.640]  Garbage. You're going to do it through log enrichment, where you might be
[12:42.640 --> 12:45.740]  interpreting a new log, and you want to improve it somehow.
[12:46.480 --> 12:49.500]  Alternatively, and kind of tied with that, is improved parsing.
[12:49.960 --> 12:54.920]  Let's say you have this message on the right here of the screen, where you have network connection detected,
[12:54.920 --> 13:01.040]  and all this stuff. Depending on how your setup is, these may not actually be their own fields.
[13:01.360 --> 13:06.140]  So, you might not be able to go through and be like, hey, how do I query the image, or the user,
[13:06.140 --> 13:10.440]  or the protocol, or anything else you see on there? Maybe it's all just in that one message,
[13:10.440 --> 13:13.020]  which makes it kind of ugly for Kibana search purposes.
[13:13.520 --> 13:18.180]  Therefore, maybe you need to go back and improve that parsing and implement that.
[13:18.180 --> 13:23.320]  And when you do that, and you try to query, let's say, the user, it's still not going to work because you didn't refresh it.
[13:24.120 --> 13:29.200]  And also, this can happen if you have a new update in any of the aspects you used,
[13:29.600 --> 13:37.060]  such as if you just upgraded Elastic from an older version to a newer version that uses the Elastic Common Schema.
[13:37.060 --> 13:42.140]  You're going to have a lot of new field names, and you're going to need to refresh that.
[13:42.140 --> 13:47.980]  Additionally, if you implemented any custom events that we'll talk about soon, you're also going to run into that.
[13:49.080 --> 13:53.360]  So, don't just refresh blindly. Think about why it happened and if you can figure it out.
[13:55.400 --> 14:03.720]  So, speaking of that, when is it suitable to actually use Logstash for log enrichment or just improving your contextual awareness?
[14:04.180 --> 14:08.440]  To me, it comes down to three primary things, either to improve your search capabilities,
[14:08.440 --> 14:13.800]  such as if you're doing a search on a Windows environment and you really don't want to see the service accounts,
[14:13.800 --> 14:16.240]  you just want to see what the users themselves are doing.
[14:17.260 --> 14:21.960]  Therefore, you could make this really complex query to remove all those IPs or host names,
[14:21.960 --> 14:28.100]  or you can make a complex query to include all the different hosts and their subnets they're in.
[14:28.780 --> 14:30.820]  And again, filter out those service accounts.
[14:32.900 --> 14:38.440]  Or you can just mark all service accounts as the Windows logs go through Logstash.
[14:38.680 --> 14:44.080]  And if it sees that, you can just tag your service account and then you can simply tell Kibana,
[14:44.080 --> 14:47.480]  don't use or exclude all records with this tag.
[14:47.660 --> 14:52.040]  Makes it a lot easier, makes it easier on the analysts, makes sure you make fewer mistakes,
[14:52.860 --> 14:56.220]  and just overall improves your experience.
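The service-account idea can be sketched in a few lines of Python. This is an analogue of what a Logstash filter would do, not Logstash configuration syntax, and the account names, the `user` field, and the `service_account` tag are all hypothetical.

```python
# Hypothetical service accounts; in practice this list comes from your environment.
SERVICE_ACCOUNTS = {"svc_backup", "svc_sql"}

def tag_service_account(event):
    """Return a copy of the event, tagged if its user is a known service account."""
    tags = list(event.get("tags", []))
    if event.get("user", "").lower() in SERVICE_ACCOUNTS and "service_account" not in tags:
        tags.append("service_account")
    return {**event, "tags": tags}

def without_tag(events, tag):
    """The analyst-side filter: drop every record that carries the tag."""
    return [e for e in events if tag not in e.get("tags", [])]

events = [
    {"user": "alice", "action": "logon"},
    {"user": "SVC_SQL", "action": "logon"},
]
tagged = [tag_service_account(e) for e in events]
for e in without_tag(tagged, "service_account"):
    print(e)
```

Tagging once at ingest means the analyst only has to exclude a single tag, instead of rebuilding a long list of account names in every query.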
[14:56.900 --> 15:00.160]  Additionally, sometimes you just need to transform or fix data.
[15:00.160 --> 15:06.960]  So an example of this is often Linux will send a timestamp, maybe in a weird format that you're not using.
[15:06.960 --> 15:10.340]  So your Windows and your Linux may send it in a different format.
[15:11.040 --> 15:15.120]  And as a result, that makes it hard to compare things and be able to see,
[15:15.120 --> 15:18.200]  did something happen at the same time or in a similar window?
[15:18.520 --> 15:23.500]  Alternatively, maybe one of your systems is set up to use UTC, but the other is using local.
[15:23.620 --> 15:27.400]  That also doesn't help. These are all different things you can use Logstash to fix.
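Here is a minimal Python sketch of the timestamp problem just described: two sources report the same moment in different formats and time zones, and normalizing both to UTC in one format makes them directly comparable. The formats and the UTC-6 offset are assumptions chosen for illustration, and this uses Python's standard library rather than any Logstash plugin.

```python
from datetime import datetime, timedelta, timezone

def to_utc_iso(raw, fmt, utc_offset_hours=0):
    """Parse a timestamp string in a known format and offset; return UTC ISO 8601."""
    source_tz = timezone(timedelta(hours=utc_offset_hours))
    parsed = datetime.strptime(raw, fmt).replace(tzinfo=source_tz)
    return parsed.astimezone(timezone.utc).isoformat()

# Hypothetical: one system logs local time (UTC-6), the other logs UTC in another format.
local_style = to_utc_iso("2020-03-01 08:15:00", "%Y-%m-%d %H:%M:%S", utc_offset_hours=-6)
utc_style = to_utc_iso("03/01/2020 14:15:00", "%m/%d/%Y %H:%M:%S")

print(local_style)
print(utc_style)
print(local_style == utc_style)  # both describe the same moment once normalized
```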
[15:27.920 --> 15:31.720]  And additionally, why not improve your context?
[15:31.720 --> 15:36.420]  So rather than having you have to go like, OK, let me see if this IP is a known bad IP.
[15:36.440 --> 15:40.160]  Let me see if this URL is a bad URL or something like that.
[15:40.160 --> 15:47.560]  You can have Logstash run a query and be like, hey, is this IP that's coming in or going out part of a botnet?
[15:47.760 --> 15:50.420]  Or is someone visiting a known malware site?
[15:50.460 --> 15:54.560]  There's a lot of things you can do, and I'm not going to get into all of these per se,
[15:54.560 --> 15:59.280]  but these are just examples to try to get you thinking about how can you improve your environment?
[16:01.300 --> 16:06.220]  So in order to understand this, again, yes, you, the analyst, may not ever be doing this.
[16:07.100 --> 16:10.560]  But understand how this works so you can make more reasonable requests.
[16:10.720 --> 16:14.160]  And you're not doing things such as saying, I want to find a bad guy.
[16:14.600 --> 16:18.960]  You get nothing else. You just say, find bad guy. I'm sure we've all heard that a lot.
[16:18.960 --> 16:23.640]  We get irritated. We don't want to do that. We don't want to be the same people saying that.
[16:23.640 --> 16:30.240]  So as a result, if you're asking for something to be fixed or improved, have a better understanding of what is possible.
[16:30.800 --> 16:35.000]  So for the overall format of a Logstash configuration file, you have input and output.
[16:35.000 --> 16:39.840]  Those are the two mandatory sections. Input tells Logstash, what port should I listen on?
[16:39.840 --> 16:43.160]  What protocol should I listen on? And how should I interpret it?
[16:43.340 --> 16:48.260]  The output says, once I'm done with it, where do I output it to, and how?
[16:49.080 --> 16:55.460]  Additionally, there's a filter section that can go in between, and it's optional.
[16:55.460 --> 16:58.780]  However, all the fun things we just talked about are in there.
[16:59.120 --> 17:09.940]  And just to go a little bit more, you can have TCP, you can have UDP, you can have Beats, things like that for how Logstash should be listening.
[17:09.940 --> 17:16.000]  The codec, just briefly, is going to be how should I interpret it? JSON, CSV, something like that.
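The input, filter, output shape can be mimicked in Python to show how the three stages relate. This is a conceptual sketch only, not Logstash syntax: the "input" is a list of raw lines, the "codec" is JSON decoding, the optional filters are plain functions, and the "output" is a list. The `add_pipeline_name` filter and the `pipeline` field are invented for illustration.

```python
import json

def run_pipeline(raw_lines, filters=(), output=None):
    """Decode each raw line (input + codec), apply optional filters, then emit it."""
    sink = [] if output is None else output
    for line in raw_lines:
        event = json.loads(line)      # codec: how to interpret what was received
        for apply_filter in filters:  # filter section: optional, zero or more steps
            event = apply_filter(event)
        sink.append(event)            # output: where the finished event goes
    return sink

def add_pipeline_name(event):
    """Hypothetical filter that stamps each event with a pipeline name."""
    return {**event, "pipeline": "demo"}

raw = ['{"message": "user logged in"}', '{"message": "user logged out"}']

print(run_pipeline(raw))                                # input and output only
print(run_pipeline(raw, filters=[add_pipeline_name]))   # with a filter in between
```

Running it without filters shows why only input and output are mandatory: events still flow through unchanged, and the filter stage is where any enrichment is added.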
[17:17.360 --> 17:22.660]  Output. Note that here, you can obviously see the hosts setting, which is what we're primarily going to use.
[17:22.880 --> 17:29.600]  That also means if you see an error, like back in Lesson 1, and it can't resolve that host, it may be a DNS problem.
[17:30.020 --> 17:34.580]  So again, just trying to tie these together and better improve our understanding of how the whole stack works.
[17:35.780 --> 17:39.000]  So with that in mind, there's a thing you can do called tags.
[17:39.000 --> 17:42.680]  I mentioned this before, like tagging user accounts, tagging the machine.
[17:43.520 --> 17:50.620]  Why would you ever want to do this? Again, it helps provide contextual information, because even if I don't filter for it,
[17:50.620 --> 18:00.200]  if I just see an IP and see, hey, this is my domain controller, or hey, this is a workstation, or hey, this belongs to the commerce or the DMZ subnet.
[18:00.200 --> 18:05.740]  These are all things that help us quickly go like, okay, I have a better idea of what your purpose in life is.
[18:06.180 --> 18:10.260]  Alternatively, you can just say, hey, this is a Linux system, this is a Windows system.
[18:10.260 --> 18:14.060]  Again, you can probably figure that out with 5 or 10 seconds more thought.
[18:14.280 --> 18:17.900]  But these all add up and make you operate better and more efficiently.
[18:18.600 --> 18:28.300]  You can also identify the type of logs. So if you only want to look at, like, Apache error logs to see what's going on, you can do that.
[18:28.300 --> 18:34.500]  Access logs, same thing. You can also, you know, obviously identify admin accounts, user accounts.
[18:34.680 --> 18:38.120]  Pretty much, if you think you can make it happen, you can do it.
[18:39.300 --> 18:51.560]  In this example you see on the slide, it's a really simple, basic one where it says, hey, if you get any logs that come from Auditbeat, which is something we'll talk about soon, don't worry about it.
[18:51.820 --> 18:56.580]  It's going to add a tag of Linux because Auditbeat only exists on Linux.
[18:57.300 --> 19:03.760]  Again, you can add a lot more complex information here. This is just to give you the beginnings to make you a little dangerous.
[19:04.240 --> 19:09.280]  Additionally, it's important to look at tags because Logstash will add tags when a parser fails.
[19:09.720 --> 19:19.960]  So even if you didn't set up a custom one, if something's failing along the line, it's going to let you know, hey, this part in particular errored. Go look at it.
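Both halves of this, adding your own tag and watching for the failure tags Logstash adds, can be sketched in Python. This is an analogue of the logic, not the Logstash configuration from the slide. It assumes events carry an `agent.type` field naming the Beat that shipped them, and it checks for the failure tags `_grokparsefailure`, `_dateparsefailure`, and `_jsonparsefailure`, which Logstash filters add when parsing fails.

```python
# Tags Logstash filters add when parsing fails.
PARSE_FAILURE_TAGS = {"_grokparsefailure", "_dateparsefailure", "_jsonparsefailure"}

def tag_linux_if_auditbeat(event):
    """Add a 'linux' tag to events shipped by Auditbeat (assumes agent.type is set)."""
    tags = list(event.get("tags", []))
    if event.get("agent", {}).get("type") == "auditbeat" and "linux" not in tags:
        tags.append("linux")
    return {**event, "tags": tags}

def failed_parsers(event):
    """Return which parse-failure tags, if any, the event is carrying."""
    return sorted(PARSE_FAILURE_TAGS & set(event.get("tags", [])))

event = {"agent": {"type": "auditbeat"}, "message": "raw text", "tags": ["_grokparsefailure"]}
event = tag_linux_if_auditbeat(event)

print(event["tags"])
print(failed_parsers(event))
```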
[19:21.000 --> 19:21.600]  So.
[19:24.000 --> 19:28.040]  Again, some of the different things you may want to consider with Logstash in general.
[19:29.100 --> 19:31.780]  And again, this is more about the filter section.
[19:32.380 --> 19:35.600]  Timestamps that aren't sent through Beats may not come in a format you like.
[19:35.600 --> 19:40.220]  They may come in different formats, they may come in the wrong time zone, something like that.
[19:40.500 --> 19:45.380]  Additionally, one thing I particularly like doing is splitting a field and making the pieces into new fields.
[19:45.560 --> 19:51.480]  So rather than just seeing mycomputer.local, maybe I want to see the hostname, mycomputer, and the domain, local.
[19:52.140 --> 19:54.700]  Maybe you want to have all three. You can do that.
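The split described here is simple to show in Python. This is a sketch of the idea, not a Logstash filter, and the output field names `fqdn`, `hostname`, and `domain` are just illustrative choices.

```python
def split_fqdn(fqdn):
    """Split a fully qualified name into hostname and domain, keeping the original."""
    hostname, _, domain = fqdn.partition(".")
    return {"fqdn": fqdn, "hostname": hostname, "domain": domain}

print(split_fqdn("mycomputer.local"))
print(split_fqdn("web01.corp.example.com"))
```

Keeping the original value alongside the two new fields gives you "all three", so existing searches on the full name keep working.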
[19:55.320 --> 20:02.860]  You can remove fields and tags. One thing that really kind of aggravates me, but has its use, is that Beats will generate an automated tag.
[20:02.900 --> 20:10.540]  However, this can be a really lengthy tag, so you might want to rename it or you just want to remove it because you don't care that it's some complex thing.
[20:11.720 --> 20:15.020]  And again, the Translate plugin is going to be how you rename the fields.
[20:15.020 --> 20:27.420]  And more than just being like, hey, let me rename this, you know, Beats tag, it can be a lot better and actually do like, let me convert the HTTP codes into their actual meaning so you don't have to keep pulling up a little cheat sheet.
[20:27.460 --> 20:31.820]  As much as you promise you know all the error codes or all the different codes, we don't.
[20:32.660 --> 20:37.640]  So that helps you, again, save that little bit of time and be a bit more efficient.
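The HTTP-code lookup can be sketched with Python's standard library, which already ships the official reason phrases. This only illustrates the kind of dictionary lookup being described; it is not the Logstash Translate plugin itself.

```python
from http import HTTPStatus

def describe_status(code):
    """Translate an HTTP status code into its standard reason phrase."""
    try:
        return HTTPStatus(int(code)).phrase
    except ValueError:
        # Not a number, or not a registered status code.
        return "Unknown"

for code in (200, "404", 503, 799):
    print(code, describe_status(code))
```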
[20:39.020 --> 20:51.140]  So the overall goal when you're filtering and improving your Logstash pipeline is to improve those logs by adding additional information that would have normally been done manually.
[20:51.900 --> 21:02.880]  Some of the more interesting plugins, or plugins you might use regularly, are GeoIP, which again just adds geographical info, says, hey, this IP is from China, Russia, America, whatever.
[21:02.880 --> 21:07.080]  Obviously, it's just doing it based on what the servers report.
[21:07.700 --> 21:11.720]  If you're that suspicious about it, double check it, but it can give you that quick insight.
[21:12.380 --> 21:18.160]  DNS can perform standard or reverse lookups, again, handy, saves you that little bit of time.
[21:19.040 --> 21:25.700]  Threats Classifier is one that's really interesting where it can enrich your logs and add a little bit of context based on the MITRE ATT&CK matrix.
[21:26.260 --> 21:32.140]  So again, saving you a little bit of time, obviously, especially with something like that, double check it.
[21:33.100 --> 21:41.520]  And UserAgent helps parse your UserAgent strings into fields, so you can actually search on it a bit more efficiently and not have to do maybe quite so many wildcards.
[21:43.780 --> 21:51.080]  So overall, if you, the analyst, can better understand Logstash, you can make better, more reasonable requests for new features and improvements.
[21:51.260 --> 21:59.300]  It allows you to better understand the errors, so again, you can submit that ticket or ask your neighbor and be like, hey, I'm getting this error and I think it means this.
[22:00.460 --> 22:07.100]  And overall, you can improve the creation of your dashboards or visualizations through asking to improve those pipelines.
[22:07.440 --> 22:16.280]  Again, I probably gave you here just enough to be dangerous and maybe really slow down your pipeline, so avoid too many of those filters, but have fun.
[22:18.220 --> 22:27.120]  So with that out of the way, have you ever found yourself running some custom scripts to obtain information from your endpoints and just wish you could view it in Elastic?
[22:27.120 --> 22:33.460]  Maybe you're just running, hey, I want to get all the processes and import it in, just so I can see.
[22:33.660 --> 22:42.400]  Maybe you're just doing data analytics, any data from an incident response investigation, anything else that's not natively created in the log.
[22:43.500 --> 22:50.720]  Again, with me without coffee, maybe I just want to see a pretty visualization with it rather than just viewing it naturally.
[22:51.220 --> 22:58.040]  If this is you too, you're not alone, and we're going to go into how you actually take these custom scripts and throw it into Elastic.
[22:59.260 --> 23:09.760]  So again, me without caffeine in the morning is like, I'm tired of reading the CSV, it's 7 a.m., it's ugly, and I don't remember how to Ctrl+F because I haven't had coffee.
[23:11.560 --> 23:18.000]  And I have to compare it to my dashboard. Why can't they be together so I can make better dashboards or just do better searches?
[23:18.620 --> 23:28.500]  Well, you can. First off, if you really hate yourself, you can just curl everything. You could go curl in that CSV and put it into Elastic.
[23:29.680 --> 23:34.200]  And I don't recommend it because at least I seem to forget curl any time over the weekend.
[23:35.040 --> 23:41.360]  You can also use Python. We all love Python. And probably a few of you are now arguing, it's all good.
[23:41.780 --> 23:50.300]  So if you want to use Python, all you have to do is install the Elasticsearch package through pip. You'll connect it to your Elastic, you'll do your steps, and you'll profit.
[23:51.040 --> 23:58.220]  What it actually allows you to do is you can pull data from your indices, and you can make your edits, and then you push back to it.
[23:58.540 --> 24:05.660]  It allows you to actually conduct the searches, because remember, Elastic is more than just storing the information. You can search with it.
[24:06.140 --> 24:15.140]  Overall, it gives you the ability to use Elastic without Kibana. Again, you're not going to get the visualizations, you're not going to get the dashboards, but there's a lot you can do with it.
[24:16.780 --> 24:28.640]  So, how would you do this? First off, hopefully you have your data in a JSON format. Elastic in general prefers things to be in JSON, even Logstash. Why is that?
[24:28.640 --> 24:41.080]  JSON is already a structured language. You know, it's basically a really fancy key value type of thing. So, it makes it really quick to parse, versus if you have syslog coming in, which we'll talk about soon.
[24:41.440 --> 24:50.860]  That's all separated by spaces, and not all the fields are guaranteed, and there's no way to easily say it's not there. So, JSON in general will speed everything up.
[24:51.360 --> 25:03.680]  So, if you don't have your data already in a JSON format, you're going to have to make it into one. Then, you'll load it into the Python JSON object, and then you'll upload it into Elastic, simply by doing...
[25:03.680 --> 25:16.920]  calling your index, specify the index name, specify the type of the document, give it an ID number, note you want it to probably be unique if you're creating a new one, and then you put your JSON object there.
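The indexing call described above can be sketched roughly as follows. This is a minimal sketch, not the code from the slides: the index name, field names, and localhost URL are made up for illustration, and it assumes a recent `elasticsearch` Python client where `index()` takes `index`, `id`, and `document` keyword arguments. Recent Elasticsearch versions have dropped document types, so the "type of the document" argument mentioned in the talk is omitted here.

```python
import csv
import io
import json


def csv_rows_to_docs(csv_text):
    """Turn CSV text into a list of plain dicts, one per row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def index_docs(es, index_name, docs):
    """Index each doc under its own unique ID and return the IDs used.

    `es` is assumed to behave like elasticsearch.Elasticsearch.
    """
    ids = []
    for number, doc in enumerate(docs, start=1):
        doc_id = str(number)
        # Round-trip through json to confirm the doc is JSON-serializable.
        payload = json.loads(json.dumps(doc))
        es.index(index=index_name, id=doc_id, document=payload)
        ids.append(doc_id)
    return ids


def load_sample():
    """Hypothetical entry point: needs `pip install elasticsearch` and a local node."""
    from elasticsearch import Elasticsearch

    client = Elasticsearch("http://localhost:9200")  # assumed address
    sample = "src_ip,dest_port\n10.0.0.5,80\n10.0.0.9,443\n"
    return index_docs(client, "demo-connections", csv_rows_to_docs(sample))
```

Calling `load_sample()` against a test cluster would push two documents, one per CSV row. Every value arrives as a string unless a mapping or an explicit cast says otherwise.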
[25:17.820 --> 25:29.180]  You can do this for every single little thing you do. If you need to update a lot of different indices, that's going to be a lot of copy and paste, and you're probably going to make mistakes.
[25:29.760 --> 25:39.460]  So, alternatively, you can just use Elastic's bulk feature. Again, this isn't Elastic itself, this is the Elasticsearch package through Python.
[25:39.460 --> 25:48.460]  And what this does is that it gives you better performance. Just with one call, you can do multiple indexing and deletion operations, again, just in one API call.
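A rough sketch of the bulk approach, assuming the `elasticsearch.helpers.bulk` helper from the Python package. The index name and documents are placeholders. The helper takes an iterable of action dictionaries, so index and delete operations can be mixed in one call.

```python
def bulk_actions(index_name, docs, delete_ids=()):
    """Yield action dicts in the shape elasticsearch.helpers.bulk accepts."""
    for number, doc in enumerate(docs, start=1):
        yield {
            "_op_type": "index",
            "_index": index_name,
            "_id": str(number),
            "_source": doc,
        }
    for doc_id in delete_ids:
        # Delete actions carry no _source, only the target index and ID.
        yield {"_op_type": "delete", "_index": index_name, "_id": str(doc_id)}


def send_bulk(es, index_name, docs, delete_ids=()):
    """Send every action through the bulk helper (requires the elasticsearch package)."""
    from elasticsearch import helpers

    # helpers.bulk returns a (success_count, errors) tuple.
    return helpers.bulk(es, bulk_actions(index_name, docs, delete_ids))
```

Building the actions with a generator keeps memory use flat for large files, since the helper consumes them in batches rather than needing the whole list up front.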
[25:48.960 --> 26:00.140]  So, if you're interested in trying to do this, but you're like, man, I really suck at converting things into JSON, don't worry, I'm with you.
[26:00.140 --> 26:12.240]  So, what you can do is you can go to GitHub and go download the Elasticsearch loader. Again, link's right there on the slide. Fantastic one.
[26:12.540 --> 26:23.140]  What this does is that it'll take JSON, CSV, or Parquet files, and it'll go ahead, convert if needed, and push it for you just in one command.
[26:24.440 --> 26:31.100]  Additionally, you can set up some custom mappings to help make sure your data is interpreted the right way. Why would you want to do this?
[26:31.680 --> 26:41.580]  Do you want the number one to be an integer or a string? Do you want port 80 to be a number or a string? Do you want to be able to do less than or equal to?
[26:41.800 --> 26:50.380]  Those type of things you sometimes want to make sure are interpreted a specific way. Maybe with your IPs, you want it to actually be IPs and not a string.
[26:50.380 --> 26:53.760]  These are all different things you can try to do with the custom mappings.
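To make the mapping point concrete, here is a small sketch. The field names are hypothetical, and `indices.create(index=..., mappings=...)` is assumed to be available, as it is in recent versions of the Python client. The comparison at the end shows why a port stored as a string behaves badly with "less than or equal to".

```python
import ipaddress

# Hypothetical mapping: tell Elastic which fields are IPs and which are numbers.
CONNECTION_MAPPINGS = {
    "properties": {
        "src_ip": {"type": "ip"},
        "dest_port": {"type": "integer"},
        "message": {"type": "text"},
    }
}


def create_index_with_mappings(es, index_name):
    """Create the index with explicit mappings before loading any data."""
    es.indices.create(index=index_name, mappings=CONNECTION_MAPPINGS)


def coerce_row(row):
    """Cast CSV strings to the Python types the mapping expects."""
    ipaddress.ip_address(row["src_ip"])  # raises ValueError on a bad address
    return {**row, "dest_port": int(row["dest_port"])}


# As strings, "80" sorts after "443" because "8" > "4".
string_compare = "80" <= "443"
# As integers, the comparison means what an analyst expects.
number_compare = coerce_row({"src_ip": "10.0.0.5", "dest_port": "80"})["dest_port"] <= 443
print(string_compare, number_compare)
```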
[26:54.780 --> 27:02.240]  But what is the downside in doing all this I just talked about, whether it's curl or Python or using the Elasticsearch loader?
[27:02.720 --> 27:15.640]  You can't really do any enrichment or tagging that we just talked about. And again, if you don't set up those custom mappings, which you can do in the Python method as well, it can be incorrect and you're going to have to fix that.
[27:18.040 --> 27:22.700]  So instead, if you really don't want to do that much Python, Filebeat is going to be your friend.
[27:23.180 --> 27:31.160]  What Filebeat does is that it's going to take any logs you specify and forward them either to Elastic or Logstash. It's up to you.
[27:31.400 --> 27:36.820]  And it supports some basic data processing and data enrichments. What's the benefit to that?
[27:37.160 --> 27:42.400]  Well, again, Logstash is already going to be a little bit overwhelmed with anything you're trying to tell it to do.
[27:42.400 --> 27:50.940]  So you can do some of these data enhancements, like adding tags, on the endpoint, have it do it there, and then send it. Save yourself a little bit of time.
[27:51.020 --> 28:00.940]  Remember, everything you can save improves overall performance. You just need to decide, are your endpoints struggling in processing power? Or is Logstash?
[28:01.080 --> 28:06.520]  There's a bunch of different solutions you can do out there, but we're not necessarily going to get into those nitty gritty details.
[28:07.420 --> 28:09.740]  So overall, how does Filebeat work?
[28:09.740 --> 28:17.680]  First off, you're going to use your favorite file format in the world that Elastic loves to use, or the Elastic Stack, and that's YAML.
[28:19.380 --> 28:23.640]  And YAML is great because you need to specify the spaces specifically, so good luck.
[28:25.020 --> 28:32.340]  So other than the pain you'll cause yourself with that, what you do with Filebeat, at least, is again, you'll do...
[28:32.340 --> 28:37.680]  And what I have on the right is a very basic one. It doesn't include everything you need. It's just the configuration.
[28:37.680 --> 28:46.980]  We say for the inputs, I'm going to grab type log. The Filebeat website, depending on what version you have, will tell you what types it supports.
[28:46.980 --> 28:49.540]  And this is just going to be a Linux log type.
[28:50.560 --> 28:58.000]  And what it does is it says, hey, collect type log and collect it from, in this case, var log and anything that ends in .log.
[28:58.840 --> 29:02.240]  So Filebeat reads that and it says, hey, let's do this.
[29:02.240 --> 29:08.060]  And what it does is it starts a harvester, which is just, you know, child processes for each input type.
[29:08.060 --> 29:11.940]  So you can have type log, whatever other types. It's going to start one for each.
[29:13.320 --> 29:16.380]  And additionally, it's going to create one for each file.
[29:16.780 --> 29:22.600]  Don't worry, Filebeat's not going to kill your computer by creating like 100 children. It controls it. It has a max.
[29:23.480 --> 29:31.460]  What it does is that each one goes ahead, grabs the log, opens it, reads it, sends it to an aggregator.
[29:31.460 --> 29:41.280]  And once it has everything, or when it reaches a certain threshold, it'll send it to either Logstash or Elastic, depending on how you set it up.
[29:42.100 --> 29:52.700]  So the input aspect, again, starts the harvesters and manages them. The harvester, all they care about is opening the file, reading it, and submitting it to the aggregator.
[29:55.060 --> 30:01.580]  So, in the previous slide, you just saw how it ended in star dot log.
[30:02.160 --> 30:15.700]  You're going to have to specify the type of the document and where it is. Filebeat is not recursive, so any additional directories you have, let's say under /var/log, you're going to have to specify those and keep going into it.
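The non-recursive behavior is easy to see with an ordinary glob. The sketch below uses Python's `glob` module as a stand-in, not Filebeat's own matcher, and builds a throwaway directory tree with made-up file names: a pattern like `*.log` only sees the top level, so each subdirectory level needs its own pattern.

```python
import glob
import os
import tempfile


def matched_logs(root, patterns):
    """Return paths under `root` matched by simple globs, relative to root."""
    found = set()
    for pattern in patterns:
        for path in glob.glob(os.path.join(root, pattern)):
            found.add(os.path.relpath(path, root))
    return sorted(found)


def demo():
    """Build a tiny fake log tree and compare one pattern against two."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "nginx"))
        for name in ("syslog.log", os.path.join("nginx", "access.log")):
            with open(os.path.join(root, name), "w") as handle:
                handle.write("example\n")
        top_only = matched_logs(root, ["*.log"])
        with_subdir = matched_logs(root, ["*.log", "*/*.log"])
    return top_only, with_subdir


print(demo())
```

The first result holds only the top-level file. The second picks up the subdirectory file because a second pattern was listed for that level.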
[30:16.020 --> 30:22.680]  Additionally, with those data enrichment aspects, Filebeat by default is not going to parse those necessarily.
[30:22.700 --> 30:31.340]  Like, it's not going to consider what type of application it is and automatically parse those for you to say, hey, this is a string, this is an IP, all that jazz.
[30:32.040 --> 30:39.180]  Instead, what you can do is you can use modules. Modules, the ones I've listed here, are just the beginning. There's a lot more.
[30:39.180 --> 30:46.000]  It includes things like system, which are going to be your typical, like /var/log ones, things you see in dmesg, things like that.
[30:46.400 --> 30:53.240]  Nginx for using Nginx, MySQL, IIS, so it's not just Linux, don't worry, and so many more.
[30:53.400 --> 31:01.480]  An example you see on the right, it's Apache. We can enable or disable just the access and error, specify where they are, and it works.
[31:02.640 --> 31:08.660]  So, again, this simplifies your config. All you do is enable it, specify where the path is, and then you're good to go.
[31:09.400 --> 31:14.260]  There's a lot more settings you can specify, and that's all going to be in Elastic's documentation.
[31:16.360 --> 31:21.800]  So, you're probably like, gosh, that's been a while.
[31:22.680 --> 31:28.680]  And you really don't care about the fact that I want to go back to my Linux origins because I'm tired of talking about Windows and that incy little bit of Linux in the slides.
[31:30.020 --> 31:36.100]  And a lot of the verbal examples I've provided have been Windows, and obviously some of this can be converted to networking devices.
[31:36.540 --> 31:38.220]  But what about Linux?
[31:39.820 --> 31:46.740]  With Linux, when we generally forward logs, we use syslog, rsyslog, or syslog-ng. Typically, those are the most popular ones.
[31:47.640 --> 31:53.520]  Some implementations such as syslog-ng can forward directly to Elastic. Fantastic.
[31:54.040 --> 31:59.840]  However, others are going to require help, whether using Filebeat or using a Python script, whatever the case may be.
[32:00.700 --> 32:07.600]  But overall, this only helps your services. It only helps you find things that you can configure in there and be able to read from this log file.
[32:08.180 --> 32:13.700]  And you get into a little bit of weird things when you try to talk about, I want to forward things from log in, log out.
[32:13.700 --> 32:20.000]  I want to see my sudo or su privileges being used. I want to see when someone edits a very particular file.
[32:20.940 --> 32:31.460]  Things like that, it's kind of getting to the point of, can we have Sysmon for Linux? Because let's face it, Sysmon is great on Windows. It helps us find so many more bad things.
[32:33.440 --> 32:37.320]  Linux doesn't have a super famous way of making this happen.
[32:38.620 --> 32:45.100]  But that's what auditd is for. auditd is available for basically all distros out there of Linux.
[32:45.740 --> 32:51.580]  The only ones I've seen a little bit of issues with are Gentoo and maybe a little bit of Arch, but you can still make them work.
[32:52.540 --> 32:59.140]  And what it does is it allows you to log basically anything. Here on the slide is just some more common things you're going to do.
[32:59.200 --> 33:06.700]  It allows you to log the date, time, and type of outcome of an event. And don't worry, I'm going to define what an event is in just a moment.
[33:06.700 --> 33:14.820]  An association of that event and who triggered the event. Was it root? Was it user Pinky? Who was it?
[33:15.480 --> 33:19.980]  Any modifications to the audit configuration to try to maybe circumvent it.
[33:20.680 --> 33:28.460]  All uses of authentication mechanisms like SSH or Kerberos. Because remember, Linux can be joined to your Active Directory network.
[33:29.160 --> 33:33.040]  And any changes to trusted databases like /etc/passwd.
[33:33.040 --> 33:44.440]  And as a little heads up, the little tweet on the slide, that will actually create or generate alerts, or events in this case, for new process creation, just like Sysmon.
[33:45.680 --> 33:54.060]  So, event. I said I was going to explain that. What is it? What it is, is it's the result of a rule being triggered. What's a rule?
[33:54.460 --> 33:59.940]  It's what you see later in the slide, but it's everything you specify in /etc/audit/audit.rules.
[34:00.860 --> 34:07.940]  So, there are two overall types of rules. Files to monitor, and monitoring system calls.
[34:08.540 --> 34:19.780]  So, generally when you want to specify a file, just as a quick, dirty intro, is that you're going to do "-w", for watch, the file to monitor, "-p", any permissions you want to trigger on.
[34:19.800 --> 34:25.180]  And this generally comes into read, write, execute, and change in attributes. So, four.
[34:25.560 --> 34:28.140]  And "-k", what you want the name of the rule to be.
[34:28.140 --> 34:32.860]  Make this something that makes sense, so if it gets fired off, you understand it.
[34:33.480 --> 34:44.020]  An example, in the bullet below, it says, hey, I want to watch /etc/shadow, and anything that is written to it, or if the attribute changes, notify me with the event name, shadow.
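Spelled out, the file watch described above has the shape `-w <file> -p <permissions> -k <name>`. Here is a small sketch that assembles such a rule line and rejects permission letters outside the four mentioned (r, w, x, a). The helper name is made up for illustration and is not part of auditd.

```python
def file_watch_rule(path, permissions, key):
    """Build an auditd file-watch rule line: -w <path> -p <perms> -k <key>."""
    allowed = set("rwxa")  # read, write, execute, attribute change
    if not permissions or not set(permissions) <= allowed:
        raise ValueError(f"permissions must use only r, w, x, a: {permissions!r}")
    return f"-w {path} -p {permissions} -k {key}"


# The example from the talk: watch /etc/shadow for writes and attribute changes.
print(file_watch_rule("/etc/shadow", "wa", "shadow"))
# prints: -w /etc/shadow -p wa -k shadow
```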
[34:45.020 --> 34:49.880]  For system calls, it's a little weird, and I'll be honest, every time I see it, I have to go back to the man page.
[34:50.580 --> 34:54.960]  But what it is, is that you have "-a", an action, followed by a list.
[34:54.960 --> 35:03.520]  You have "-S", the system call, "-F", which is the field, which often ends up just being like, hey, specify the architecture.
[35:04.560 --> 35:08.160]  So, is it going to be 32-bit, 64-bit, that type of deal.
[35:08.580 --> 35:11.400]  And then, "-k", the name of the rule.
[35:12.620 --> 35:16.560]  So, action must be one of two values, always or never.
[35:16.820 --> 35:20.000]  So, always alert on it, or never.
[35:20.960 --> 35:28.000]  And then, you have one of the following for the list, task, entry, exit, user, exclude.
[35:28.220 --> 35:34.800]  I find the most common one is always an exit, and what this does, is that it will always alert when that system call exits.
[35:35.880 --> 35:39.520]  So, entry would be the opposite, when that system call is initially called.
[35:40.000 --> 35:49.480]  In this case, it's going to work on 32-bit architectures, and what this means is not just 32-bit distros, but 32-bit calls for the system calls.
[35:50.420 --> 35:54.420]  And all these system calls you see here are all for changing the time.
[35:54.700 --> 36:02.160]  So, adjusting the time, setting the time of day, doing clock set time, those are all different ways you can adjust the time.
[36:02.480 --> 36:08.180]  So, the name of the rule, or what the event will be named, is time in this case.
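Putting those pieces together, a system call rule might be assembled like this. It is a sketch only: the helper is hypothetical, and the three syscall names (`adjtimex`, `settimeofday`, `clock_settime`) are inferred from the spoken description rather than copied from the slide.

```python
ACTIONS = {"always", "never"}
LISTS = {"task", "entry", "exit", "user", "exclude"}


def syscall_rule(action, list_name, arch, syscalls, key):
    """Build an auditd syscall rule: -a action,list -F arch=... -S ... -k key."""
    if action not in ACTIONS:
        raise ValueError(f"action must be one of {sorted(ACTIONS)}")
    if list_name not in LISTS:
        raise ValueError(f"list must be one of {sorted(LISTS)}")
    parts = ["-a", f"{action},{list_name}", "-F", f"arch={arch}"]
    for name in syscalls:
        parts += ["-S", name]  # one -S per system call to watch
    parts += ["-k", key]
    return " ".join(parts)


# Assumed syscall names for "changing the time" on 32-bit calls.
TIME_RULE = syscall_rule(
    "always", "exit", "b32", ["adjtimex", "settimeofday", "clock_settime"], "time"
)
print(TIME_RULE)
# prints: -a always,exit -F arch=b32 -S adjtimex -S settimeofday -S clock_settime -k time
```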
[36:09.880 --> 36:14.900]  So, how do you get started with auditd, because it's amazing.
[36:15.220 --> 36:18.580]  First off, you're going to install auditd. It may already be on your system.
[36:18.580 --> 36:23.320]  You're going to edit the config if you need. You're going to create all your rules by hand.
[36:23.960 --> 36:36.600]  No, you're not. Go to that GitHub, download it. It's a fantastic one that has a lot of the most common things you'll need, including things like empty password, the change of time, a lot of things that catch bad activity are going to be in there.
[36:37.180 --> 36:42.720]  You're going to restart the service, and you're going to be like, I have errors. She lied to me.
[36:42.720 --> 36:49.840]  No, some of those things are distro-specific, or require specific users like Hrani, and you may need to comment it out.
[36:50.440 --> 36:53.240]  Do that. Keep fixing it until it works.
[36:53.840 --> 37:03.000]  Then you can view the actual logs, where, depending on how you set up your rules, it's going to generate an event, or the proper way is to use ausearch.
[37:03.060 --> 37:06.520]  You can look at the man page, do it all there. So, great.
[37:07.360 --> 37:16.180]  If you followed all these, you know, and we weren't there to talk and just actually worked through this, you'd be like, I have auditd installed. It's working. It's generating alerts.
[37:17.160 --> 37:23.560]  Gosh, it's annoying how it does it every time I use su, and I'm trying to be a good person and not use root, and it's OK, because if I use root, it's going to cause alerts anyway.
[37:24.840 --> 37:28.520]  So, hey, it's working. What next?
[37:30.320 --> 37:43.520]  You're going to install Auditbeat, because you want to, you know, you want to move those events and its logs into Kibana and Elasticsearch so you can visualize it nicely instead of, you know, just staring at your text files all day.
[37:43.960 --> 37:51.160]  So, you're going to point Auditbeat's config to auditd's config, or you put those rules straight from the GitHub into auditbeat.yml.
[37:51.840 --> 38:03.560]  And note, if you created it from scratch, or like you just extracted Auditbeat rather than installing it from the package manager, you need to make sure auditbeat.yml is owned by root, otherwise it complains.
[38:04.040 --> 38:07.780]  You're going to start the service and get this exact error on the screen.
[38:08.960 --> 38:09.660]  Why?
[38:10.420 --> 38:14.100]  Well, it says, hey, failed to send it because it's already running.
[38:14.520 --> 38:15.620]  That's weird.
[38:16.680 --> 38:19.240]  I lied to you. You need to stop and disable auditd.
[38:19.760 --> 38:25.540]  The problem with this is that Auditbeat ends up actually running its own version of auditd.
[38:25.760 --> 38:32.920]  So, it partially helps avoid some of those alerts that happen any time it reads those logs.
[38:34.660 --> 38:37.060]  Additionally, auditd has its own formatting.
[38:37.560 --> 38:38.360]  Because, again, it's going to...
[38:39.080 --> 38:40.740]  auditd doesn't use JSON.
[38:40.900 --> 38:45.780]  It's just more typical Linux formatting stuff, you know, space, tabs, all that.
[38:45.780 --> 38:54.820]  As a result, auditd is going to take longer or cause more overhead for Logstash to parse and cause you, if you have to configure Logstash, a lot of grief.
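To illustrate the extra parsing work, here is a small sketch that turns one space-separated `key=value` audit line into JSON. The sample line is made up for illustration, not captured from a real system, and real auditd records have more fields and edge cases than this handles.

```python
import json
import shlex


def parse_audit_line(line):
    """Split a space-separated key=value audit record into a dict of strings."""
    fields = {}
    for token in shlex.split(line):  # shlex strips the quotes around values
        key, separator, value = token.partition("=")
        if separator:
            fields[key] = value
    return fields


# Made-up sample in the general shape of an auditd SYSCALL record.
SAMPLE = 'type=SYSCALL msg=audit(1600000000.123:42): arch=40000003 syscall=124 success=yes key="time"'

parsed = parse_audit_line(SAMPLE)
print(json.dumps(parsed, indent=2))
```

Every value still comes out as a string, and the timestamp stays buried inside the `msg` field, which is the kind of cleanup that otherwise has to be configured by hand in Logstash.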
[38:55.500 --> 39:01.460]  And sometimes if auditd and Auditbeat are both on, it can cause a kernel panic. That all sucks.
[39:01.760 --> 39:03.260]  So, why did you bother?
[39:03.960 --> 39:10.780]  Do you really want to see all the times when your auditd rules fail in your Elastic?
[39:11.640 --> 39:17.600]  Probably not because it's going to confuse you the next morning or the next analyst who looks at it and goes, oh, God, it's broken.
[39:17.760 --> 39:26.160]  You want to test it locally to avoid debugging, you know, through it and having to figure out where the error is and without freaking out your fellow analysts.
[39:27.600 --> 39:36.380]  So, again, install auditd, test it, disable it and stop it, install Auditbeat, get it working, and life will be good.
[39:37.940 --> 39:42.780]  So, now you may be wondering, what did we just cover? Oh, God.
[39:43.400 --> 39:46.000]  First, Elastic and Kibana are different.
[39:46.760 --> 39:50.800]  Hopefully, you won't make the same mistake if you're guilty of saying it sometimes.
[39:51.720 --> 39:55.180]  Logstash will make your Kibana queries way better, I promise.
[39:56.140 --> 39:59.780]  Feeding in custom documents is possible and actually not that bad.
[39:59.800 --> 40:03.800]  But the first time, like anything, new experiences suck, after that, it's fun.
[40:04.400 --> 40:07.040]  auditd is incredibly painful to set up.
[40:07.320 --> 40:11.080]  Actually, no, tuning it is extremely painful, but it's worth it.
[40:11.080 --> 40:14.920]  Because, again, don't be blind on your Linux systems.
[40:15.900 --> 40:22.780]  And the lesson I want to briefly hint at you in this lesson or this slideshow, YAML sucks.
[40:22.820 --> 40:29.980]  Probably 90% of your errors throughout all of this will be white space issues because you are dirty and use spaces and tabs interchangeably.
[40:30.520 --> 40:33.160]  And if they're not set the same, life sucks.
[40:35.140 --> 40:38.020]  So, if you want to see these slides, they're actually on my GitLab.
[40:39.280 --> 40:40.600]  You can grab it from there.
[40:41.300 --> 40:45.860]  And I'll take a few minutes to answer any questions, but if not, I'll be in the chat for a bit.
[40:46.140 --> 40:49.940]  And if you don't think of it until 30 days from now, whatever, ask me on Twitter.
[40:50.860 --> 40:51.800]  That's all I have.
[40:53.500 --> 40:57.720]  Thank you very much, Dr. Pinky, for that wonderful presentation.
[40:57.720 --> 41:08.460]  As always, we encourage you to join our Blue Team Village Discord server and ask questions in Text Talk Track 1.
[41:08.460 --> 41:15.900]  I will try to take a quick look here and see if we have any questions for you.
[41:17.600 --> 41:20.160]  And I am not seeing anything.
[41:20.160 --> 41:31.120]  So, with that, if you do have questions, feel free to direct them to the presenter as she will be hanging around for a little bit.
[41:31.120 --> 41:32.540]  Thank you again.
